Convergence of Stochastic Iterative Dynamic Programming Algorithms
Authors
Abstract
Increasing attention has recently been paid to algorithms based on dynamic programming (DP) due to the suitability of DP for learning problems involving control. In stochastic environments where the system being controlled is only incompletely known, however, a unifying theoretical account of these methods has been missing. In this paper we relate DP-based learning algorithms to the powerful techniques of stochastic approximation via a new convergence theorem, enabling us to establish a class of convergent algorithms to which both TD(λ) and Q-learning belong.
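As a rough illustration of the link between DP-based learning and stochastic approximation, the sketch below runs tabular Q-learning with step sizes 1/n per state-action pair, which satisfy the usual stochastic-approximation conditions (the step sizes sum to infinity while their squares sum to a finite value). It is not taken from the paper: the function name `q_learning`, the MDP encoding, the uniform sampling of state-action pairs, and the toy two-state problem are all assumptions made for this example.

```python
import random


def q_learning(transitions, rewards, gamma=0.5, steps=20000, seed=0):
    """Tabular Q-learning on a small, hypothetical MDP.

    transitions[s][a] is a list of (probability, next_state) pairs.
    rewards[s][a] is the reward for taking action a in state s.
    """
    rng = random.Random(seed)
    n_states = len(transitions)
    n_actions = len(transitions[0])
    q = [[0.0] * n_actions for _ in range(n_states)]
    visits = [[0] * n_actions for _ in range(n_states)]

    for _ in range(steps):
        # Assumption: sample pairs uniformly so each one keeps being visited.
        s = rng.randrange(n_states)
        a = rng.randrange(n_actions)

        # Draw the next state from the (unknown to the learner) dynamics.
        probs, next_states = zip(*transitions[s][a])
        s_next = rng.choices(next_states, weights=probs)[0]

        # Step size 1/n for the n-th visit of (s, a).
        visits[s][a] += 1
        alpha = 1.0 / visits[s][a]

        # Move Q(s, a) toward the sampled one-step DP target.
        target = rewards[s][a] + gamma * max(q[s_next])
        q[s][a] += alpha * (target - q[s][a])

    return q


# Hypothetical two-state, two-action MDP with stochastic transitions.
toy_transitions = [
    [[(0.8, 0), (0.2, 1)], [(0.3, 0), (0.7, 1)]],
    [[(0.5, 0), (0.5, 1)], [(1.0, 1)]],
]
toy_rewards = [[0.0, 1.0], [0.5, 1.0]]

if __name__ == "__main__":
    for row in q_learning(toy_transitions, toy_rewards):
        print([round(v, 3) for v in row])
```

Each update replaces the expectation in the DP backup with a single sampled transition, so the iteration averages noisy targets rather than computing the backup exactly. With 1/n step sizes convergence can be slow when the discount factor is close to 1, which is why the sketch defaults to `gamma=0.5`.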
Similar resources
Learning Algorithms for Risk-Sensitive Control
This is a survey of some reinforcement learning algorithms for risk-sensitive control on infinite horizon. Basics of the risk-sensitive control problem are recalled, notably the corresponding dynamic programming equation and the value and policy iteration methods for its solution. Basics of stochastic approximation algorithms are also sketched, in particular the ‘o.d.e.’ approach for its stabil...
Robust inter and intra-cell layouts design model dealing with stochastic dynamic problems
In this paper, a novel quadratic assignment-based mathematical model is developed for concurrent design of robust inter and intra-cell layouts in dynamic stochastic environments of manufacturing systems. In the proposed model, in addition to considering time value of money, the product demands are presumed to be dependent normally distributed random variables with known expectation, variance, a...
Differential Dynamic Programming for Solving Nonlinear Programming Problems
Dynamic programming is one of the methods which utilize special structures of large-scale mathematical programming problems. Conventional dynamic programming, however, can hardly solve mathematical programming problems with many constraints. This paper proposes differential dynamic programming algorithms for solving large-scale nonlinear programming problems with many constraints and proves thei...
Strong convergence of modified iterative algorithm for family of asymptotically nonexpansive mappings
In this paper we introduce new modified implicit and explicit algorithms and prove strong convergence of the two algorithms to a common fixed point of a family of uniformly asymptotically regular asymptotically nonexpansive mappings in a real reflexive Banach space with a uniformly Gâteaux differentiable norm. Our result is applicable in $L_{p}(\ell_{p})$ spaces, $1 < p
Application of DJ method to Ito stochastic differential equations
This paper extends the iterative method described in [V. Daftardar-Gejji, H. Jafari, An iterative method for solving nonlinear functional equations, J. Math. Anal. Appl. 316 (2006) 753-763] to solve Ito stochastic differential equations. The convergence of the method for Ito stochastic differential equations is assessed. To verify the efficiency of the method, some examples are ex...